Feature: read_parquet_mergetree #13
# Conflicts:
#   CMakeLists.txt
#   chsql/src/default_table_functions.cpp
#   src/chsql_extension.cpp
Hey @carlopi, any chance you or someone on the team knows how to get around the Windows build error? 🙏
Can you try to reduce the diff? Or try to copy the setup of extensions like duckdb_delta.
@carlopi Is it enough if I tell you that the real change is only in the file https://github.com/lmangani/duckdb-extension-clickhouse-sql/pull/13/files#diff-c5bffd6b887e2ced50224f44652dab784c9c7f7ab8c46a390410cc58490391ed ? The other changes are just insignificant internal file moves. Or do you need a separate PR with the function implementation?
Then it's likely either a
@carlopi Aaah. It's about the Windows build problem. From the MSVC++ linker logs I see that somehow the linker wants to link
I have no idea why it wants to link the same
Screenshot of @akvlad kicking the Windows builder where it hurts 😄
Amazing work @akvlad, let's merge and proceed with some field testing 🎉
read_parquet_mergetree
Description
The read_parquet_mergetree chsql function provides a familiar interface for ClickHouse users by emulating aspects of the MergeTree engine strategy. Its primary purpose is to efficiently merge multiple Parquet files using a specified primary sort key, without consuming excessive memory, so that the resulting file supports fast range queries.
Syntax
Features
Parameters
FILE_ARRAY[]: An array of file paths to merge
PRIMARY_SORT_KEY: Specifies the column(s) used as the primary sort key for merging and ordering data
Benchmark
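To make the merge strategy and the two parameters described above more concrete, here is a minimal Python sketch of a streaming k-way merge. It is an illustration only, not the extension's C++ implementation and not a benchmark. It assumes each input is already sorted by the key, and it uses in-memory lists of rows as stand-ins for Parquet files. The names `merge_sorted_runs`, `run_a`, and `run_b` are hypothetical.

```python
import heapq
from operator import itemgetter
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def merge_sorted_runs(runs: List[Iterable[Row]], primary_sort_key: str) -> Iterator[Row]:
    """Lazily merge runs that are each ALREADY sorted by primary_sort_key.

    Assumption (not confirmed by the PR): inputs are pre-sorted, as MergeTree
    parts would be. heapq.merge holds only one pending row per run, so memory
    grows with the number of runs rather than the total number of rows.
    """
    return heapq.merge(*runs, key=itemgetter(primary_sort_key))


# Hypothetical stand-ins for two Parquet files, each sorted by "ts".
run_a = [{"ts": 1, "v": "a"}, {"ts": 4, "v": "d"}]
run_b = [{"ts": 2, "v": "b"}, {"ts": 3, "v": "c"}, {"ts": 5, "v": "e"}]

merged = list(merge_sorted_runs([run_a, run_b], "ts"))
print([row["ts"] for row in merged])  # prints [1, 2, 3, 4, 5]
```

Because the merged output is globally ordered by the key, a reader can in principle skip whole ranges of rows when answering a range query on that key, which is the property the description above refers to.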